148 PART 4 Comparing Groups

Executing a t test

Statistical software packages contain commands that can execute (or run) t tests (see Chapter 4 for more about these packages). The examples presented here use R, and in this section, we explain the data structure required for running the various t tests in R. For demonstration, we use data from the 2017–2020 National Health and Nutrition Examination Survey (NHANES) file (available at wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?Cycle=2017-2020).

» For the one-group t test, you need the column of data containing the variable whose mean you want to compare to the hypothesized value (H), and you need to know H. R and other software let you specify a value for H and assume 0 if you don’t specify one. In the NHANES data, the fasting glucose variable is LBXGLU, so the R code to test the mean fasting glucose against a maximum healthy level of 100 mg/dL in an R dataframe named GLUCOSE is t.test(GLUCOSE$LBXGLU, mu = 100).

» For the paired t test, you need two columns of data representing the pair of numbers you want to enter into the paired t test. For example, in NHANES, systolic blood pressure (SBP) was measured in the same participant twice (variables BPXOSY1 and BPXOSY2). To compare these with a paired t test in an R dataframe named BP, the code is t.test(BP$BPXOSY1, BP$BPXOSY2, paired = TRUE).

» For the independent t test, you need to have one column coded as the grouping variable (preferably with a two-state flag coded as 0 and 1), and another column with the value you want to test. We created a two-state flag in the NHANES data called MARRIED where 1 = married and 0 = all other marital statuses. To compare mean fasting glucose level between these two groups in a dataframe named NHANES, we used this code: t.test(NHANES$LBXGLU ~ NHANES$MARRIED).
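The three calls described above can be tried without downloading NHANES. The following is a minimal sketch that uses simulated values (not real NHANES data) stored under the same column names. The dataframe name DEMO, the sample size, and the distribution parameters are assumptions made only for illustration. Note that for two independent groups, R's t.test runs the Welch (unequal-variance) test by default; adding var.equal = TRUE requests the equal-variance version.

```r
# Simulated stand-in data. These values are NOT from NHANES; only the
# column names match the variables discussed in the text.
set.seed(42)
n <- 200
DEMO <- data.frame(
  LBXGLU  = rnorm(n, mean = 105, sd = 15),    # fasting glucose, mg/dL
  BPXOSY1 = rnorm(n, mean = 122, sd = 14),    # first SBP reading
  MARRIED = rbinom(n, size = 1, prob = 0.5)   # 1 = married, 0 = other
)
# Second SBP reading on the same simulated participants
DEMO$BPXOSY2 <- DEMO$BPXOSY1 + rnorm(n, mean = -1, sd = 5)

# One-group t test against a hypothesized mean of 100 mg/dL
one_group <- t.test(DEMO$LBXGLU, mu = 100)

# Paired t test on the two SBP readings
paired <- t.test(DEMO$BPXOSY1, DEMO$BPXOSY2, paired = TRUE)

# Independent t test: Welch by default, Student with var.equal = TRUE
welch   <- t.test(LBXGLU ~ MARRIED, data = DEMO)
student <- t.test(LBXGLU ~ MARRIED, data = DEMO, var.equal = TRUE)

# Each result is a list; the df is stored in $parameter
print(one_group)
print(paired)
print(welch)
print(student)
```

Each returned object also carries the t statistic ($statistic), the p value ($p.value), and a confidence interval ($conf.int), so you can pull out individual pieces instead of reading the printed summary.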

TABLE 11-1 How t Tests Calculate Difference, Standard Error, and Degrees of Freedom

| | One-Group | Paired | Unpaired t, Equal Variance | Welch t, Unequal Variance |
|---|---|---|---|---|
| D | Difference between mean of observations and a hypothesized value (h) | Mean of paired differences | Difference between means of the two groups | Difference between means of the two groups |
| SE | SE of the observations | SE of paired differences | SE of difference, based on a pooled estimate of SD within each group | SE of difference, from SE of each mean, by propagation of errors |
| df | Number of observations – 1 | Number of pairs – 1 | Total number of observations – 2 | “Effective” df, based on the size and SD of the two groups |
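To make the entries in Table 11-1 concrete, here is a hedged sketch that computes D, SE, and df by hand for each of the four tests and checks that D / SE and df agree with what R's t.test reports. The vectors x and y and the hypothesized value h are simulated, hypothetical inputs, and the Welch df uses the Welch-Satterthwaite approximation, which is the formula R applies for its unequal-variance test.

```r
# Simulated example values (hypothetical; not real data)
set.seed(1)
x <- rnorm(30, mean = 102, sd = 12)  # group 1, or first measurement
y <- rnorm(30, mean = 98,  sd = 15)  # group 2, or second measurement
h <- 100                             # hypothesized value
n1 <- length(x)
n2 <- length(y)

# One-group: D = mean minus h, SE = SD / sqrt(n), df = n - 1
D1  <- mean(x) - h
SE1 <- sd(x) / sqrt(n1)
df1 <- n1 - 1

# Paired: work on the within-pair differences
d   <- x - y
D2  <- mean(d)
SE2 <- sd(d) / sqrt(length(d))
df2 <- length(d) - 1

# Unpaired, equal variance: pooled variance estimate
D3  <- mean(x) - mean(y)
sp2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)
SE3 <- sqrt(sp2 * (1 / n1 + 1 / n2))
df3 <- n1 + n2 - 2

# Welch, unequal variance: combine the squared SE of each mean
v1  <- var(x) / n1
v2  <- var(y) / n2
D4  <- mean(x) - mean(y)
SE4 <- sqrt(v1 + v2)
df4 <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Compare hand calculations with t.test
r1 <- t.test(x, mu = h)
r2 <- t.test(x, y, paired = TRUE)
r3 <- t.test(x, y, var.equal = TRUE)
r4 <- t.test(x, y)

checks <- c(
  one_group = isTRUE(all.equal(D1 / SE1, unname(r1$statistic))) &&
              isTRUE(all.equal(df1, unname(r1$parameter))),
  paired    = isTRUE(all.equal(D2 / SE2, unname(r2$statistic))) &&
              isTRUE(all.equal(df2, unname(r2$parameter))),
  student   = isTRUE(all.equal(D3 / SE3, unname(r3$statistic))) &&
              isTRUE(all.equal(df3, unname(r3$parameter))),
  welch     = isTRUE(all.equal(D4 / SE4, unname(r4$statistic))) &&
              isTRUE(all.equal(df4, unname(r4$parameter)))
)
print(checks)
```

Because both simulated groups have the same size here, the Student and Welch versions share the same D and SE and differ only in df; with unequal group sizes the two SE values would generally differ as well.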